Version: v1.4.1

Shopify Order Return Prediction

Predict which orders are likely to be returned before they ship, enabling proactive interventions like quality checks, adjusted return policies, or targeted post-purchase support.

  • Dataset Source: Kaggle - Shopify Sales Dataset for ML & EDA
  • Problem Type: Binary Classification (order-level)
  • Target Variable: is_returned (1 = order was returned, 0 = order kept)
  • Return Rate: ~14.8% (moderately imbalanced, so check per-class metrics as well as accuracy)
  • Use Case: Flag high-risk orders at checkout for operational routing, fraud detection, or post-purchase follow-up

Package Imports

import pandas as pd
import numpy as np
import xplainable as xp
from xplainable.core.models import XClassifier
from xplainable.core.optimisation.bayesian import XParamOptimiser
from sklearn.model_selection import train_test_split
import requests
import json

from xplainable_client.client.client import XplainableClient
from xplainable_client.client.base import XplainableAPIError

from xplainable_preprocessing import PipelineSpec, StepSpec, compile_spec

Instantiate Xplainable Cloud

Initialise the xplainable cloud using an API key from: https://platform.xplainable.io/

# Initialize Xplainable Cloud client
client = XplainableClient(
    api_key="",  # Create api key in xplainable cloud - https://platform.xplainable.io/
    hostname="https://platform.xplainable.io"
)

Load Shopify Order Data

Unlike the churn notebook, this model operates at the order level — each row is a single order, no aggregation needed. This makes the preprocessing pipeline fully self-contained and directly deployable.

df = pd.read_csv("shopify_sales_dataset_ml_eda.csv", parse_dates=["order_date"])

print(f"Orders: {len(df):,}")
print(f"Return rate: {df['is_returned'].mean():.1%}")
print(f"\nReturn distribution:")
print(df["is_returned"].value_counts())
df.head()

1. Data Preprocessing

Columns dropped:

  • order_id, customer_id, product_id: high-cardinality identifiers with no predictive value
  • discounted_price, revenue: redundant, since they can be derived from product_price, discount_percent, and quantity
  • profit: data leakage, since profit is affected by the return itself
  • order_date: dropped only after month and day-of-week features are extracted from it

Features retained:

  • product_category, product_price, discount_percent, quantity: order characteristics
  • customer_country, traffic_source, payment_method: context
  • shipping_cost, rating: potential return signals
  • order_month, order_dow: temporal patterns

preprocessing_spec = PipelineSpec(steps=[
    # Extract temporal features from order_date before dropping it
    StepSpec(
        id="extract_datetime",
        type="DateTimeExtractTransformer",
        columns=["order_date"],
        params={
            "components": ["month", "dayofweek"],
            "drop_original": True,
        },
        description="Extract month and day-of-week from order date",
    ),

    # Lowercase all categorical columns
    StepSpec(
        id="lowercase_categoricals",
        type="TextCleanTransformer",
        columns=["product_category", "customer_country",
                 "traffic_source", "payment_method"],
        params={"operations": ["lowercase"]},
        description="Standardize categorical values to lowercase",
    ),

    # Drop IDs and leakage columns
    StepSpec(
        id="drop_columns",
        type="DropColumnsTransformer",
        params={"columns": [
            "order_id",          # High-cardinality identifier
            "customer_id",       # High-cardinality identifier
            "product_id",        # High-cardinality identifier
            "discounted_price",  # Redundant with product_price + discount_percent
            "revenue",           # Derived from discounted_price * quantity
            "profit",            # Leakage: profit is affected by the return
        ]},
        description="Drop IDs, redundant, and leakage columns",
    ),
])

pipeline = compile_spec(preprocessing_spec)
df_transformed = pipeline.fit_transform(df)

print(f"Transformed shape: {df_transformed.shape}")
print(f"Columns: {list(df_transformed.columns)}")
df_transformed.head()
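To sanity-check what the spec is meant to do, the three steps can be sketched in plain pandas on a two-row toy frame. The toy values are made up, and the output column names order_month and order_dow are assumptions taken from the feature list above; the compiled pipeline may name its outputs differently.

```python
import pandas as pd

# Toy orders (made-up values, for illustration only)
toy = pd.DataFrame({
    "order_id": [1, 2],
    "order_date": pd.to_datetime(["2024-01-15", "2024-03-02"]),
    "product_category": ["Electronics", "HOME"],
    "profit": [12.5, -3.0],
    "is_returned": [0, 1],
})

out = toy.copy()

# 1. Extract month and day-of-week, then drop the original date column
out["order_month"] = out["order_date"].dt.month
out["order_dow"] = out["order_date"].dt.dayofweek  # Monday = 0
out = out.drop(columns=["order_date"])

# 2. Lowercase categorical values
out["product_category"] = out["product_category"].str.lower()

# 3. Drop identifier and leakage columns
out = out.drop(columns=["order_id", "profit"])

print(out)
```

Comparing a frame like this against the pipeline's output on the same rows is a quick way to confirm that the spec behaves as intended.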

Persist Preprocessor to Xplainable Cloud

Since this is an order-level model with no aggregation step, the entire preprocessing pipeline is self-contained and can be persisted directly.

try:
    preprocessor_id, preprocessor_version_id = client.preprocessing.create_preprocessor(
        name="Shopify Returns Preprocessing",
        description="Order-level feature transforms for return prediction. "
                    "Extracts datetime features, lowercases categoricals, drops IDs and leakage columns.",
        spec=preprocessing_spec.model_dump(),
        sample_df=df,
    )
    print(f"Preprocessor created: {preprocessor_id} (version: {preprocessor_version_id})")
except (XplainableAPIError, ValueError) as e:
    print(f"Error creating preprocessor: {e}")
    preprocessor_id, preprocessor_version_id = None, None

Train/Test Split

X, y = df_transformed.drop(columns=["is_returned"]), df_transformed["is_returned"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

print(f"Train: {X_train.shape[0]:,} samples | Return rate: {y_train.mean():.1%}")
print(f"Test: {X_test.shape[0]:,} samples | Return rate: {y_test.mean():.1%}")

2. Model Optimisation

opt = XParamOptimiser()
params = opt.optimise(X_train, y_train)

3. Model Training

model = XClassifier(**params)
model.fit(X_train, y_train)

4. Model Interpretability and Explainability

Which order characteristics are most predictive of returns? The explainer reveals whether price, discount level, product category, or other factors drive return risk.

model.explain()

5. Model Evaluation

from sklearn.metrics import accuracy_score, classification_report, f1_score

y_pred = model.predict(X_test)

print(f"Test Accuracy: {accuracy_score(y_test, y_pred):.4f}")
print(f"Macro F1: {f1_score(y_test, y_pred, average='macro'):.4f}")
print()
print(classification_report(y_test, y_pred, target_names=["Kept", "Returned"], digits=3))
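Because only about 14.8% of orders are returned, accuracy alone can look strong for a model that never flags a return. The sketch below uses synthetic labels at that rate (not the dataset itself) and a hypothetical helper, f1_for_class, to show why macro F1 is worth reading next to accuracy.

```python
# Synthetic labels at roughly the dataset's return rate: 148 returned out of 1,000
y_true = [1] * 148 + [0] * 852
y_all_kept = [0] * len(y_true)  # a "model" that never predicts a return


def f1_for_class(y_true, y_pred, cls):
    """F1 score for one class, computed from raw counts."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


accuracy = sum(t == p for t, p in zip(y_true, y_all_kept)) / len(y_true)
macro_f1 = (f1_for_class(y_true, y_all_kept, 0) + f1_for_class(y_true, y_all_kept, 1)) / 2

print(f"Accuracy: {accuracy:.3f}  Macro F1: {macro_f1:.3f}")
```

The always-kept baseline reaches 85.2% accuracy but a macro F1 of about 0.46, so a trained model should be judged against that floor rather than against 50%.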

6. Model Persisting

try:
    model_id, version_id = client.models.create_model(
        model=model,
        model_name="Shopify Order Return Prediction",
        model_description="Predicting which Shopify orders are likely to be returned based on order characteristics, product details, and customer context.",
        x=X_train,
        y=y_train
    )
    print(f"Model created: {model_id} (version: {version_id})")
except XplainableAPIError as e:
    print(f"Error creating model: {e}")
    model_id, version_id = None, None

7. Model Deployment

if model_id and version_id:
    try:
        deployment_response = client.deployments.deploy(
            model_version_id=version_id
        )
        deployment_id = deployment_response.deployment_id
    except XplainableAPIError as e:
        print(f"Error deploying model: {e}")
        deployment_id = None
else:
    deployment_id = None

8. Testing the Deployment

# Activate and generate key
if deployment_id:
    try:
        client.deployments.activate_deployment(deployment_id=deployment_id)
        deploy_key = client.deployments.generate_deploy_key(
            deployment_id=deployment_id,
            description="API key for Shopify Returns",
            days_until_expiry=1
        )
        print(f"Deploy key created: {str(deploy_key)}")
    except XplainableAPIError as e:
        print(f"Error: {e}")
        deploy_key = None
else:
    deploy_key = None
    print("Deployment ID not available")

# Test prediction with a sample order
body = json.loads(X_test.sample(1).to_json(orient="records"))
print("Sample payload:")
print(json.dumps(body, indent=2))

if deploy_key and body:
    response = requests.post(
        url="https://inference.xplainable.io/v1/predict",
        headers={"api_key": str(deploy_key)},
        json=body
    )
    print(f"\nPrediction result: {response.json()}")
else:
    print("\nDeploy key or body not available for prediction")

9. AI-Generated Report

if model_id and version_id:
    report = client.gpt.generate_report(
        model_id=model_id,
        version_id=version_id,
        target_description="Order return likelihood (1 = will be returned, 0 = will be kept)",
        project_objective="Identify high-risk orders to enable proactive quality checks, adjusted return policies, and targeted post-purchase support",
        max_features=10,
        temperature=0.7
    )

    from IPython.display import Markdown, display
    display(Markdown(report.body))
else:
    print("Model not persisted; skipping report generation")

10. Contribution-Driven Return Optimization

The xplainable model's per-feature contributions explain why each order is at risk. Several features in the dataset represent controllable business levers:

  • shipping_cost — the business can offer free or subsidized shipping
  • discount_percent — the business decides whether to discount
  • quantity — bundle incentives can increase items per order
  • product_price — pricing strategy is controllable

The model's partition profiles give us the measured return rate shift when a feature moves from one partition to another. We use these counterfactual shifts as lever effects — derived from the data, not assumed.

Extract Contributions and Counterfactual Lever Effects

For each controllable feature, the lever effect = current contribution - best achievable partition score. This tells us how much the return probability would drop if we moved that order to the best partition for that feature.
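As a small worked example of that definition, the sketch below computes the lever effect for one feature. The partition labels and scores are hypothetical numbers chosen for illustration, not values from the trained model, and lever_effect is a helper name introduced here.

```python
# Hypothetical partition scores for shipping_cost (made-up numbers)
shipping_partitions = {"0-2": -0.03, "2-6": 0.01, "6+": 0.05}


def lever_effect(current_partition, partitions):
    """Current contribution minus the lowest (most retention-leaning) score."""
    best = min(partitions.values())
    return partitions[current_partition] - best


# An order in the most expensive shipping partition
effect = lever_effect("6+", shipping_partitions)
print(f"Lever effect: {effect:.2f}")

# An order already in the best partition has no room to improve
print(f"Already best: {lever_effect('0-2', shipping_partitions):.2f}")
```

Under these assumed scores, moving an order from the "6+" partition to the "0-2" partition lowers its modeled return probability by 0.08, while an order already in the best partition has a lever effect of zero.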

# Get per-feature contributions for every test order
contributions = model._transform(X_test)
contrib_df = pd.DataFrame(contributions, columns=model.columns, index=X_test.index)

# Return probability = base_value + sum of contributions
base_value = model.profile['base_value']
contrib_df['return_probability'] = (contributions.sum(axis=1) + base_value).clip(0, 1)

# Define controllable features (things the business can influence)
controllable_features = [
    'shipping_cost', 'discount_percent', 'quantity', 'product_price',
    'product_category', 'traffic_source', 'payment_method',
]

# Find the best (most retention-leaning) partition score for each controllable feature
profile = model.profile
best_partition_scores = {}
for feat in controllable_features:
    scores = []
    if feat in profile['numeric']:
        scores = [p['score'] for p in profile['numeric'][feat]
                  if not (isinstance(p.get('lower', 0), float) and np.isnan(p.get('lower', 0)))]
    elif feat in profile['categorical']:
        scores = [p['score'] for p in profile['categorical'][feat]
                  if p.get('category') != 'Null']
    if scores:
        best_partition_scores[feat] = min(scores)

# Compute lever effect per order per feature
lever_effects = pd.DataFrame(index=X_test.index)
for feat in controllable_features:
    if feat in best_partition_scores and feat in contrib_df.columns:
        lever_effects[feat] = contrib_df[feat] - best_partition_scores[feat]

# Use only the levers that were actually computed, so a feature missing
# from the profile does not raise a KeyError
lever_cols = [f for f in controllable_features if f in lever_effects.columns]

# For each order, identify the feature with the biggest improvement potential
lever_effects['best_lever'] = lever_effects[lever_cols].idxmax(axis=1)
lever_effects['lever_effect'] = lever_effects[lever_cols].max(axis=1)

print(f"Base return rate: {base_value:.1%}")
print(f"\nBest partition scores (most retention-leaning):")
for feat, score in sorted(best_partition_scores.items(), key=lambda x: x[1]):
    print(f"  {feat:25s} best={score:+.4f}")

print(f"\nBest lever distribution (which feature offers most improvement per order):")
print(lever_effects['best_lever'].value_counts().to_string())

print(f"\nAverage return reduction by best lever:")
summary = lever_effects.groupby('best_lever')['lever_effect'].agg(['count', 'mean'])
summary.columns = ['orders', 'avg_return_reduction']
print(summary.sort_values('avg_return_reduction', ascending=False).round(4).to_string())

Map Levers to Actions and Costs

Each controllable feature maps to a business action. The lever effect comes from the model (data-driven); the cost is a business input (replace with your actuals).

# Map controllable features to business actions
# Lever effect = from model partitions (data-driven)
# Cost = business input (replace with your actual costs)
lever_actions = {
    "shipping_cost": {"action": "Free/subsidized shipping", "cost": 5.00},
    "discount_percent": {"action": "Targeted discount offer", "cost": 3.00},
    "quantity": {"action": "Bundle / packaging upgrade", "cost": 1.00},
    "product_price": {"action": "Price-match guarantee", "cost": 0.30},
    "product_category": {"action": "Pre-ship quality check", "cost": 2.00},
    "traffic_source": {"action": "Product video / sizing info", "cost": 0.50},
    "payment_method": {"action": "Post-purchase confirmation SMS", "cost": 1.50},
}

# Average cost of processing a return (shipping + restocking + CS time)
AVG_RETURN_COST = 25.00

# Build optimization DataFrame
optimization = lever_effects[['best_lever', 'lever_effect']].copy()
optimization['return_prob'] = contrib_df['return_probability']
optimization['order_value'] = X_test['product_price'] * X_test['quantity']

optimization['action'] = optimization['best_lever'].map(
    lambda f: lever_actions.get(f, {}).get('action', 'General follow-up')
)
optimization['lever_cost'] = optimization['best_lever'].map(
    lambda f: lever_actions.get(f, {}).get('cost', 1.00)
)

print("Action assignment:")
print(optimization.groupby('action').agg(
    orders=('action', 'count'),
    avg_return_prob=('return_prob', 'mean'),
    avg_lever_effect=('lever_effect', 'mean'),
    avg_cost=('lever_cost', 'mean'),
).sort_values('orders', ascending=False).round(3).to_string())

Expected Value Optimization

The net EV uses the model-derived lever effect as the return prevention rate:

Net EV = lever_effect x avg_return_cost - lever_cost

Where lever_effect is the counterfactual return probability reduction from the model's partition profile — measured from the data. avg_return_cost is the operational cost of processing a return (shipping, restocking, CS time).
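To make the formula concrete, here is a minimal sketch using the example costs from this walkthrough ($25 per return, $5 for free shipping, $0.30 for a price-match guarantee). The helper names net_ev and break_even_effect, and the 0.08 lever effect, are introduced here purely for illustration.

```python
AVG_RETURN_COST = 25.00  # example value used in this walkthrough


def net_ev(lever_effect, lever_cost, avg_return_cost=AVG_RETURN_COST):
    """Expected savings from the intervention minus what it costs."""
    return lever_effect * avg_return_cost - lever_cost


def break_even_effect(lever_cost, avg_return_cost=AVG_RETURN_COST):
    """Smallest lever effect at which the intervention pays for itself."""
    return lever_cost / avg_return_cost


# Free shipping ($5.00) on an order with an assumed lever effect of 0.08
ev = net_ev(0.08, 5.00)
print(f"Net EV: {ev:.2f}")

# Lever effect needed before each action breaks even
print(f"Free shipping break-even: {break_even_effect(5.00):.3f}")
print(f"Price-match break-even:   {break_even_effect(0.30):.3f}")
```

With these inputs, free shipping needs a lever effect of at least 0.20 to break even, so an 0.08 effect yields a net EV of about -3.00 and the order would be skipped. The cheap price-match action breaks even at only 0.012.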

# Net EV = lever_effect (from model) * avg_return_cost - lever_cost (from business)
optimization['expected_savings'] = (optimization['lever_effect'] * AVG_RETURN_COST).round(2)
optimization['net_ev'] = (optimization['expected_savings'] - optimization['lever_cost']).round(2)
optimization['roi'] = np.where(
    optimization['lever_cost'] > 0,
    (optimization['net_ev'] / optimization['lever_cost']).round(2),
    0
)

positive_ev = optimization[optimization['net_ev'] > 0].copy()
negative_ev = optimization[optimization['net_ev'] <= 0].copy()

print(f"Orders worth intervening on: {len(positive_ev):,} ({len(positive_ev)/len(optimization):.1%})")
print(f"Orders to skip (negative EV): {len(negative_ev):,} ({len(negative_ev)/len(optimization):.1%})")
print(f"\n--- Portfolio Summary (positive-EV orders only) ---")
print(f"Total intervention cost: ${positive_ev['lever_cost'].sum():>10,.2f}")
print(f"Total expected savings: ${positive_ev['expected_savings'].sum():>10,.2f}")
print(f"Total net EV: ${positive_ev['net_ev'].sum():>10,.2f}")
if positive_ev['lever_cost'].sum() > 0:
    print(f"Portfolio ROI: {positive_ev['net_ev'].sum() / positive_ev['lever_cost'].sum():.1f}x")

# Breakdown by action
print("Net EV by action (data-driven lever effects):\n")
action_summary = positive_ev.groupby('action').agg(
    orders=('action', 'count'),
    avg_lever_effect=('lever_effect', 'mean'),
    total_cost=('lever_cost', 'sum'),
    total_savings=('expected_savings', 'sum'),
    total_net_ev=('net_ev', 'sum'),
    avg_roi=('roi', 'mean'),
).sort_values('total_net_ev', ascending=False).round(2)

action_summary

Budget-Constrained Allocation

Rank orders by ROI and allocate a fixed budget to the highest-return interventions first.

# Rank by ROI and apply budget constraints
portfolio = positive_ev.sort_values('roi', ascending=False).copy()
portfolio['cumulative_cost'] = portfolio['lever_cost'].cumsum()

budget_levels = [100, 250, 500, 1000, 2500, 5000]

budget_analysis = []
for budget in budget_levels:
    within = portfolio[portfolio['cumulative_cost'] <= budget]
    if len(within) > 0:
        total_cost = within['lever_cost'].sum()
        budget_analysis.append({
            'budget': f'${budget:,}',
            'orders_covered': len(within),
            'cost_used': round(total_cost, 2),
            'expected_savings': round(within['expected_savings'].sum(), 2),
            'net_ev': round(within['net_ev'].sum(), 2),
            'roi': f"{within['net_ev'].sum() / total_cost:.1f}x",
        })

budget_df = pd.DataFrame(budget_analysis)
print("Budget Allocation (orders ranked by ROI):\n")
budget_df

# Top 10 highest-value orders
print("Top 10 highest net-EV orders:\n")
positive_ev.nlargest(10, 'net_ev')[
    ['return_prob', 'best_lever', 'lever_effect', 'action',
     'lever_cost', 'expected_savings', 'net_ev']
].round(3)
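The cumulative-cost cutoff used in the budget analysis keeps only an unbroken run of the top-ranked orders, so it stops spending as soon as one order would exceed the budget. One possible variant, sketched below with made-up orders and a hypothetical allocate helper, skips an order that does not fit and keeps going down the ranking. This is an alternative to the notebook's approach, not a description of it.

```python
def allocate(orders, budget):
    """Greedy: highest ROI first, skipping any order that no longer fits."""
    chosen, spent = [], 0.0
    ranked = sorted(orders, key=lambda o: o["net_ev"] / o["cost"], reverse=True)
    for order in ranked:
        if spent + order["cost"] <= budget:
            chosen.append(order["id"])
            spent += order["cost"]
    return chosen, spent


# Made-up orders for illustration
orders = [
    {"id": "A", "cost": 0.50, "net_ev": 2.00},   # ROI 4.0
    {"id": "B", "cost": 5.00, "net_ev": 10.00},  # ROI 2.0
    {"id": "C", "cost": 1.00, "net_ev": 1.50},   # ROI 1.5
]

chosen, spent = allocate(orders, budget=2.00)
print(chosen, spent)
```

With a $2.00 budget, a strict cumulative cutoff would fund only order A, because B pushes the running total past the budget. The skip-and-continue variant also funds C. Neither is guaranteed to be optimal; exact allocation under a budget is a knapsack problem.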

What's Data-Driven vs What's Assumed

From the model (data-driven):

  • Return probability per order (base_value + contributions)
  • Per-feature contribution scores explaining why each order is at risk
  • Counterfactual lever effects: how much return probability changes if a feature moves from its current partition to the best partition
  • Which lever offers the most improvement for each specific order

Business inputs (replace with your actuals):

  • Intervention costs ($0.50 for a video, $5 for free shipping, etc.)
  • Average return processing cost ($25 in this example)
  • The assumption that moving a feature to a better partition is achievable via the mapped action